Worcester County
- North America > United States > Massachusetts > Worcester County > Worcester (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Asia > Middle East > Saudi Arabia > Riyadh Province > Riyadh (0.04)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Vision > Face Recognition (0.93)
- Information Technology > Artificial Intelligence > Natural Language > Generation (0.64)
PCMind-2.1-Kaiyuan-2B Technical Report
Luo, Kairong, Sun, Zhenbo, Shi, Xinyu, Chen, Shengqi, Yu, Bowen, Chen, Yunyi, Dang, Chenyi, Tao, Hengtao, Wang, Hui, Liu, Fangming, Lyu, Kaifeng, Chen, Wenguang
The rapid advancement of Large Language Models (LLMs) has resulted in a significant knowledge gap between the open-source community and industry, primarily because the latter relies on closed-source, high-quality data and training recipes. To address this, we introduce PCMind-2.1-Kaiyuan-2B, a fully open-source 2-billion-parameter model focused on improving training efficiency and effectiveness under resource constraints. Our methodology includes three key innovations: a Quantile Data Benchmarking method for systematically comparing heterogeneous open-source datasets and providing insights on data mixing strategies; a Strategic Selective Repetition scheme within a multi-phase paradigm to effectively leverage sparse, high-quality data; and a Multi-Domain Curriculum Training policy that orders samples by quality. Supported by a highly optimized data preprocessing pipeline and architectural modifications for FP16 stability, Kaiyuan-2B achieves performance competitive with state-of-the-art fully open-source models, demonstrating practical and scalable solutions for resource-limited pretraining. We release all assets (including model weights, data, and code) under the Apache 2.0 license at https://huggingface.co/thu-pacman/PCMind-2.1-Kaiyuan-2B.
- Europe > Austria > Vienna (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- (16 more...)
Identifying environmental factors associated with tetrodotoxin contamination in bivalve mollusks using eXplainable AI
Schoppema, M. C., van der Velden, B. H. M., Hürriyetoğlu, A., Klijnstra, M. D., Faassen, E. J., Gerssen, A., van der Fels-Klerx, H. J.
Since 2012, tetrodotoxin (TTX) has been found in seafoods such as bivalve mollusks in temperate European waters. TTX contamination leads to food safety risks and economic losses, making early prediction of TTX contamination vital to the food industry and competent authorities. Recent studies have pointed to shallow habitats and water temperature as the main drivers of TTX contamination in bivalve mollusks. However, the temporal relationships between abiotic factors, biotic factors, and TTX contamination remain unexplored. We have developed an explainable, deep learning-based model to predict TTX contamination in the Dutch Zeeland estuary. Inputs for the model were meteorological and hydrological features; the output was the presence or absence of TTX contamination. Results showed that the time of sunrise, time of sunset, global radiation, water temperature, and chloride concentration contributed most to TTX contamination. Thus, the effective number of sun hours, represented by day length and global radiation, was an important driver of tetrodotoxin contamination in bivalve mollusks. To conclude, our explainable deep learning model identified the aforementioned environmental factors (number of sun hours, global radiation, water temperature, and water chloride concentration) as being associated with tetrodotoxin contamination in bivalve mollusks, making our approach a valuable tool to mitigate marine toxin risks for the food industry and competent authorities.
- Europe > Netherlands > Zeeland (0.25)
- Atlantic Ocean > Mediterranean Sea > Adriatic Sea (0.04)
- Europe > United Kingdom > England (0.04)
- (7 more...)
Towards Heterogeneous Quantum Federated Learning: Challenges and Solutions
Rahman, Ratun, Nguyen, Dinh C., Thomas, Christo Kurisummoottil, Saad, Walid
Quantum federated learning (QFL) combines quantum computing and federated learning to enable decentralized model training while maintaining data privacy. QFL can improve computational efficiency and scalability by taking advantage of quantum properties such as superposition and entanglement. However, existing QFL frameworks largely assume homogeneity among quantum clients, and they do not account for real-world variances in quantum data distributions, encoding techniques, hardware noise levels, and computational capacity. These differences can create instability during training, slow convergence, and reduce overall model performance. In this paper, we conduct an in-depth examination of heterogeneity in QFL, classifying it into two categories: data heterogeneity and system heterogeneity. Then we investigate the influence of heterogeneity on training convergence and model aggregation. We critically evaluate existing mitigation solutions, highlight their limitations, and give a case study that demonstrates the viability of tackling quantum heterogeneity. Finally, we discuss potential future research areas for constructing robust and scalable heterogeneous QFL frameworks.
- North America > United States > Virginia (0.05)
- Asia > India > Karnataka > Bengaluru (0.04)
- Oceania > Australia (0.04)
- (6 more...)
Human-Robot Collaboration for the Remote Control of Mobile Humanoid Robots with Torso-Arm Coordination
Boguslavskii, Nikita, Genua, Lorena Maria, Li, Zhi
Humanoid robots are increasingly deployed in various facilities, including hospitals and assisted living environments, where they are often remotely controlled by human operators. Their kinematic redundancy enhances reachability and manipulability, enabling them to navigate complex, cluttered environments and perform a wide range of tasks. However, this redundancy also presents significant control challenges, particularly in coordinating the movements of the robot's macro-micro structure (torso and arms). Therefore, we propose various human-robot collaborative (HRC) methods for coordinating the torso and arm of remotely controlled mobile humanoid robots, aiming to balance autonomy and human input to enhance system efficiency and task execution. The proposed methods include human-initiated approaches, where users manually control torso movements, and robot-initiated approaches, which autonomously coordinate torso and arm based on factors such as reachability, task goal, or inferred human intent. We conducted a user study with N=17 participants to compare the proposed approaches in terms of task performance, manipulability, and energy efficiency, and analyzed which methods were preferred by participants. HRC control enables humans and robot autonomy to complement each other and improve overall robotic manipulation performance.
- Research Report > New Finding (0.49)
- Research Report > Experimental Study (0.49)
SAVeD: Semantic Aware Version Discovery
Our work introduces SAVeD (Semantically Aware Version Detection), a contrastive learning-based framework for identifying versions of structured datasets without relying on metadata, labels, or integration-based assumptions. SAVeD addresses a common challenge in data science: repeated labor that arises when similar prior work or transformations on datasets are difficult to identify. SAVeD employs a modified SimCLR pipeline, generating augmented table views through random transformations (e.g., row deletion, encoding perturbations). These views are embedded via a custom transformer encoder and contrasted in latent space to optimize semantic similarity. Our model learns to minimize distances between augmented views of the same dataset and maximize those between unrelated tables. We evaluate performance using validation accuracy and separation, defined respectively as the proportion of correctly classified version/non-version pairs on a hold-out set, and the difference between the average similarities of versioned and non-versioned tables (as defined by a benchmark and not provided to the model). Our experiments span five canonical datasets from the Semantic Versioning in Databases Benchmark and demonstrate substantial gains post-training. SAVeD achieves significantly higher accuracy on completely unseen tables and a significant boost in separation scores, confirming its capability to distinguish semantically altered versions. Compared to untrained baselines and prior state-of-the-art dataset-discovery methods like Starmie, our custom encoder achieves competitive or superior results.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Worcester County > Worcester (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- (6 more...)
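The two evaluation metrics named in the SAVeD abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy similarity scores, and in particular the fixed decision threshold of 0.5 are assumptions made here for illustration, since the abstract does not say how a pair is classified as version or non-version.

```python
from statistics import mean


def validation_accuracy(similarities, is_version, threshold=0.5):
    """Share of table pairs whose thresholded similarity matches the label.

    The threshold is a hypothetical choice; the abstract does not specify one.
    """
    predictions = [score >= threshold for score in similarities]
    correct = sum(pred == bool(label) for pred, label in zip(predictions, is_version))
    return correct / len(similarities)


def separation(similarities, is_version):
    """Mean similarity of versioned pairs minus mean similarity of non-versioned pairs."""
    versioned = [s for s, label in zip(similarities, is_version) if label]
    unrelated = [s for s, label in zip(similarities, is_version) if not label]
    return mean(versioned) - mean(unrelated)


# Toy hold-out set: similarity score per table pair, plus the benchmark label.
toy_similarities = [0.9, 0.8, 0.3, 0.1]
toy_labels = [True, True, False, False]

print("accuracy:", validation_accuracy(toy_similarities, toy_labels))
print("separation:", round(separation(toy_similarities, toy_labels), 2))
```

A larger separation means versioned pairs sit further from unrelated pairs in similarity, independently of where a classification threshold is placed.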
Mobile Jamming Mitigation in 5G Networks: A MUSIC-Based Adaptive Beamforming Approach
Holguin, Olivia, Donati, Rachel, Natanzi, Seyed bagher Hashemi, Tang, Bo
Mobile jammers pose a critical threat to 5G networks, particularly in military communications. This paper investigates an anti-jamming framework that enhances a strong adaptive beamforming baseline, comprising Multiple Signal Classification (MUSIC) for Direction-of-Arrival (DoA) estimation and Minimum Variance Distortionless Response (MVDR) for interference suppression, with a lightweight machine learning (ML) model for predictive error correction. Extensive simulations in a realistic highway scenario demonstrate that the integrated system achieves a high DoA estimation accuracy of up to 99.8% and an average Signal-to-Noise Ratio (SNR) improvement of 9.58 dB. Analysis reveals that the MUSIC-MVDR baseline alone accounts for the vast majority of this performance gain (9.46 dB), indicating that the primary benefit of the simple ML model lies in correcting outlier estimates rather than providing a substantial systemic SNR increase. The framework's computational efficiency validates the effectiveness of the core beamforming approach and highlights the critical trade-off between ML model complexity and practical performance gains for securing 5G communications in contested environments.
- North America > United States > Massachusetts > Worcester County > Worcester (0.04)
- North America > Cuba > Holguín Province > Holguín (0.04)
- Telecommunications (0.71)
- Health & Medicine (0.47)
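The SNR figures quoted in the jamming-mitigation abstract above can be put side by side with a little arithmetic. The sketch below assumes the ML contribution is simply the difference between the total and baseline gains in dB, which the abstract implies but does not state. The variable names are illustrative.

```python
def db_to_linear(gain_db):
    """Convert a power gain in decibels to a linear power ratio."""
    return 10 ** (gain_db / 10)


total_gain_db = 9.58     # average SNR improvement of the integrated system
baseline_gain_db = 9.46  # portion attributed to the MUSIC-MVDR baseline

# Assumption: gains add in dB, so the ML share is the remainder.
ml_gain_db = round(total_gain_db - baseline_gain_db, 2)

# Fraction of the dB improvement that the baseline already delivers.
baseline_share = baseline_gain_db / total_gain_db

# Linear power ratio corresponding to the ML model's extra gain.
ml_power_ratio = db_to_linear(ml_gain_db)

print(f"ML contribution: {ml_gain_db} dB")
print(f"Baseline share of dB gain: {baseline_share:.1%}")
print(f"ML gain as linear power ratio: {ml_power_ratio:.3f}")
```

Under this reading, the ML model adds 0.12 dB on average, a change of only a few percent in linear power. This is consistent with the abstract's conclusion that its main value lies in correcting outlier estimates.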
The CHASM-SWPC Dataset for Coronal Hole Detection & Analysis
Beck, Cutter, Smith, Evan, Katuwal, Khagendra, Kafle, Rudra, Whitehill, Jacob
Coronal holes (CHs) are low-activity, low-density solar coronal regions with open magnetic field lines (Cranmer 2009). In the extreme ultraviolet (EUV) spectrum, CHs appear as dark patches. Using daily hand-drawn maps from the Space Weather Prediction Center (SWPC), we developed a semi-automated pipeline to digitize the SWPC maps into binary segmentation masks. The resulting masks constitute the CHASM-SWPC dataset, a high-quality dataset to train and test automated CH detection models, which is released with this paper. We developed CHASM (Coronal Hole Annotation using Semi-automatic Methods), a software tool for semi-automatic annotation that enables users to rapidly and accurately annotate SWPC maps. The CHASM tool enabled us to annotate 1,111 CH masks, comprising the CHASM-SWPC-1111 dataset. We then trained multiple neural networks with the CHRONNOS (Coronal Hole RecOgnition Neural Network Over multi-Spectral-data) architecture (Jarolim et al. 2021) using the CHASM-SWPC dataset and compared their performance. Training the CHRONNOS neural network on these data achieved an accuracy of 0.9805, a True Skill Statistic (TSS) of 0.6807, and an intersection-over-union (IoU) of 0.5668, each higher than the original pretrained CHRONNOS model (Jarolim et al. 2021), which achieved an accuracy of 0.9708, a TSS of 0.6749, and an IoU of 0.4805 when evaluated on the CHASM-SWPC-1111 test set.
- Europe > United Kingdom (0.14)
- North America > United States > North Dakota > Oliver County > Center (0.05)
- North America > United States > New Mexico > Doña Ana County > Las Cruces (0.04)
- (2 more...)
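The three scores reported in the CHASM-SWPC abstract above (accuracy, TSS, and IoU) have standard definitions for binary masks. The sketch below applies those textbook formulas per pixel to a toy pair of masks. It is an illustration only: the function names and toy data are invented here, and the abstract does not describe how the authors aggregate scores across images.

```python
def confusion_counts(predicted, truth):
    """Count per-pixel true/false positives and negatives for two binary masks."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(predicted, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif not p and t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def mask_metrics(predicted, truth):
    """Return (accuracy, TSS, IoU); assumes both classes occur in `truth`."""
    tp, fp, fn, tn = confusion_counts(predicted, truth)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    tss = tp / (tp + fn) - fp / (fp + tn)  # true positive rate minus false positive rate
    iou = tp / (tp + fp + fn)              # overlap of the two hole regions over their union
    return accuracy, tss, iou


# Toy 2x4 masks: 1 marks a coronal-hole pixel, 0 marks background.
toy_prediction = [[1, 1, 0, 0],
                  [0, 0, 0, 0]]
toy_truth = [[1, 0, 0, 0],
             [1, 0, 0, 0]]

accuracy, tss, iou = mask_metrics(toy_prediction, toy_truth)
print(f"accuracy={accuracy:.4f} TSS={tss:.4f} IoU={iou:.4f}")
```

Because background pixels dominate, accuracy can stay high while IoU, which ignores true negatives, is much lower. The reported scores show the same pattern (accuracy near 0.98, IoU near 0.57).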
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Oceania > Australia (0.04)
- North America > United States > New Jersey > Hudson County > Hoboken (0.04)
- (5 more...)
- Information Technology > Modeling & Simulation (1.00)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Massachusetts > Worcester County > Worcester (0.04)